Dataset statistics
| Number of variables | 40 |
|---|---|
| Number of observations | 445132 |
| Missing cells | 902665 |
| Missing cells (%) | 5.1% |
| Duplicate rows | 145 |
| Duplicate rows (%) | < 0.1% |
| Total size in memory | 135.8 MiB |
| Average record size in memory | 320.0 B |
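The derived rows in this table can be cross-checked from the raw counts. The snippet below is a minimal sketch that assumes "Missing cells (%)" is missing cells divided by variables times observations, and that the average record size is total memory divided by the number of observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 40
n_observations = 445_132
missing_cells = 902_665
duplicate_rows = 145
total_memory_mib = 135.8  # already rounded in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_memory_mib * 1024 ** 2 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.3f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the total memory figure is itself rounded to one decimal, the recomputed record size only approximates the reported 320.0 B.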
Variable types
| Text | 1 |
|---|---|
| Categorical | 11 |
| Numeric | 6 |
| Boolean | 22 |
Alerts
Dataset has 145 (< 0.1%) duplicate rows | Duplicates
HadHeartAttack is highly imbalanced (68.5%) | Imbalance |
HadAngina is highly imbalanced (67.2%) | Imbalance |
HadStroke is highly imbalanced (74.2%) | Imbalance |
HadSkinCancer is highly imbalanced (59.7%) | Imbalance |
HadCOPD is highly imbalanced (59.6%) | Imbalance |
HadKidneyDisease is highly imbalanced (73.2%) | Imbalance |
HadDiabetes is highly imbalanced (59.9%) | Imbalance |
DeafOrHardOfHearing is highly imbalanced (55.8%) | Imbalance |
BlindOrVisionDifficulty is highly imbalanced (68.9%) | Imbalance |
DifficultyDressingBathing is highly imbalanced (75.8%) | Imbalance |
DifficultyErrands is highly imbalanced (60.7%) | Imbalance |
HighRiskLastYear is highly imbalanced (74.2%) | Imbalance |
PhysicalHealthDays has 10927 (2.5%) missing values | Missing |
MentalHealthDays has 9067 (2.0%) missing values | Missing |
LastCheckupTime has 8308 (1.9%) missing values | Missing |
SleepHours has 5453 (1.2%) missing values | Missing |
RemovedTeeth has 11360 (2.6%) missing values | Missing |
DeafOrHardOfHearing has 20647 (4.6%) missing values | Missing |
BlindOrVisionDifficulty has 21564 (4.8%) missing values | Missing |
DifficultyConcentrating has 24240 (5.4%) missing values | Missing |
DifficultyWalking has 24012 (5.4%) missing values | Missing |
DifficultyDressingBathing has 23915 (5.4%) missing values | Missing |
DifficultyErrands has 25656 (5.8%) missing values | Missing |
SmokerStatus has 35462 (8.0%) missing values | Missing |
ECigaretteUsage has 35660 (8.0%) missing values | Missing |
ChestScan has 56046 (12.6%) missing values | Missing |
RaceEthnicityCategory has 14057 (3.2%) missing values | Missing |
AgeCategory has 9079 (2.0%) missing values | Missing |
HeightInMeters has 28652 (6.4%) missing values | Missing |
WeightInKilograms has 42078 (9.5%) missing values | Missing |
BMI has 48806 (11.0%) missing values | Missing |
AlcoholDrinkers has 46574 (10.5%) missing values | Missing |
HIVTesting has 66127 (14.9%) missing values | Missing |
FluVaxLast12 has 47121 (10.6%) missing values | Missing |
PneumoVaxEver has 77040 (17.3%) missing values | Missing |
TetanusLast10Tdap has 82516 (18.5%) missing values | Missing |
HighRiskLastYear has 50623 (11.4%) missing values | Missing |
CovidPos has 50764 (11.4%) missing values | Missing |
PhysicalHealthDays has 267819 (60.2%) zeros | Zeros |
MentalHealthDays has 265229 (59.6%) zeros | Zeros |
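The imbalance percentages in the alerts above are not class shares. They appear to be consistent with one minus the Shannon entropy of the non-missing class counts, normalized by the log of the number of classes. The sketch below rests on that assumption; `imbalance_score` is an illustrative helper, not the profiler's own function.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy of the class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean that one
    class dominates. Missing values are assumed to be excluded by the caller.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)


# Non-missing class counts taken from the variable sections of this report.
had_heart_attack = [416_959, 25_108]               # False, True
had_diabetes = [368_722, 61_158, 10_329, 3_836]    # four categories

for name, counts in [("HadHeartAttack", had_heart_attack),
                     ("HadDiabetes", had_diabetes)]:
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

Under this assumption the two scores round to the 68.5% and 59.9% shown in the alerts, which is why a variable with a 94/6 split is reported as roughly 68% imbalanced rather than 94%.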
Reproduction
| Analysis started | 2024-04-06 16:30:36.032901 |
|---|---|
| Analysis finished | 2024-04-06 16:31:03.164430 |
| Duration | 27.13 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
State
Text
| Distinct | 54 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.4 MiB |
Length
| Max length | 20 |
|---|---|
| Median length | 12 |
| Mean length | 8.350541 |
| Min length | 4 |
Characters and Unicode
| Total characters | 3717093 |
|---|---|
| Distinct characters | 46 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Alabama |
|---|---|
| 2nd row | Alabama |
| 3rd row | Alabama |
| 4th row | Alabama |
| 5th row | Alabama |
Words
| Value | Count | Frequency (%) |
| new | 37524 | 7.0% |
| washington | 26152 | 4.9% |
| york | 17800 | 3.3% |
| south | 17461 | 3.3% |
| minnesota | 16821 | 3.2% |
| ohio | 16487 | 3.1% |
| maryland | 16418 | 3.1% |
| virginia | 15398 | 2.9% |
| carolina | 14542 | 2.7% |
| texas | 14245 | 2.7% |
| Other values (50) | 340315 | 63.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 479424 | 12.9% |
| i | 353695 | 9.5% |
| n | 333499 | 9.0% |
| o | 314446 | 8.5% |
| s | 259106 | 7.0% |
| e | 216472 | 5.8% |
| r | 189964 | 5.1% |
| t | 168967 | 4.5% |
| h | 124376 | 3.3% |
| l | 108158 | 2.9% |
| Other values (36) | 1168986 | 31.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3099136 | 83.4% |
| Uppercase Letter | 529926 | 14.3% |
| Space Separator | 88031 | 2.4% |
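The category names in this table correspond to Unicode general categories (Lowercase Letter is `Ll`, Uppercase Letter is `Lu`, Space Separator is `Zs`). As a rough sketch of how such counts can be derived, the snippet below uses Python's standard `unicodedata` module on a few illustrative state names; `category_counts` is a hypothetical helper, not the profiler's implementation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Illustrative sample only; the real column holds 445132 values.
sample = ["Alabama", "New York", "Ohio"]
counts = category_counts(sample)

for code, count in counts.most_common():
    print(code, count)
```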
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 479424 | 15.5% |
| i | 353695 | 11.4% |
| n | 333499 | 10.8% |
| o | 314446 | 10.1% |
| s | 259106 | 8.4% |
| e | 216472 | 7.0% |
| r | 189964 | 6.1% |
| t | 168967 | 5.5% |
| h | 124376 | 4.0% |
| l | 108158 | 3.5% |
| Other values (14) | 551029 | 17.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 88455 | 16.7% |
| N | 56843 | 10.7% |
| C | 47880 | 9.0% |
| W | 46551 | 8.8% |
| I | 37175 | 7.0% |
| O | 28018 | 5.3% |
| A | 25865 | 4.9% |
| V | 25740 | 4.9% |
| T | 19511 | 3.7% |
| D | 18801 | 3.5% |
| Other values (11) | 135087 | 25.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 88031 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3629062 | 97.6% |
| Common | 88031 | 2.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 479424 | 13.2% |
| i | 353695 | 9.7% |
| n | 333499 | 9.2% |
| o | 314446 | 8.7% |
| s | 259106 | 7.1% |
| e | 216472 | 6.0% |
| r | 189964 | 5.2% |
| t | 168967 | 4.7% |
| h | 124376 | 3.4% |
| l | 108158 | 3.0% |
| Other values (35) | 1080955 | 29.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 88031 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3717093 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 479424 | 12.9% |
| i | 353695 | 9.5% |
| n | 333499 | 9.0% |
| o | 314446 | 8.5% |
| s | 259106 | 7.0% |
| e | 216472 | 5.8% |
| r | 189964 | 5.1% |
| t | 168967 | 4.5% |
| h | 124376 | 3.3% |
| l | 108158 | 2.9% |
| Other values (36) | 1168986 | 31.4% |
Sex
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.4 MiB |
| Female | 235893 |
|---|---|
| Male | 209239 |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.0598789 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2252314 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Female |
|---|---|
| 2nd row | Female |
| 3rd row | Female |
| 4th row | Female |
| 5th row | Female |
Common Values
| Value | Count | Frequency (%) |
| Female | 235893 | 53.0% |
| Male | 209239 | 47.0% |
Words
| Value | Count | Frequency (%) |
| female | 235893 | 53.0% |
| male | 209239 | 47.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 681025 | 30.2% |
| a | 445132 | 19.8% |
| l | 445132 | 19.8% |
| F | 235893 | 10.5% |
| m | 235893 | 10.5% |
| M | 209239 | 9.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1807182 | 80.2% |
| Uppercase Letter | 445132 | 19.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 681025 | 37.7% |
| a | 445132 | 24.6% |
| l | 445132 | 24.6% |
| m | 235893 | 13.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 235893 | 53.0% |
| M | 209239 | 47.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2252314 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 681025 | 30.2% |
| a | 445132 | 19.8% |
| l | 445132 | 19.8% |
| F | 235893 | 10.5% |
| m | 235893 | 10.5% |
| M | 209239 | 9.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2252314 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 681025 | 30.2% |
| a | 445132 | 19.8% |
| l | 445132 | 19.8% |
| F | 235893 | 10.5% |
| m | 235893 | 10.5% |
| M | 209239 | 9.3% |
GeneralHealth
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1198 |
| Missing (%) | 0.3% |
| Memory size | 3.4 MiB |
| Very good | 148444 |
|---|---|
| Good | 143598 |
| Excellent | 71878 |
| Fair | 60273 |
| Poor | 19741 |
Length
| Max length | 9 |
|---|---|
| Median length | 4 |
| Mean length | 6.4814725 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2877346 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Very good |
|---|---|
| 2nd row | Excellent |
| 3rd row | Very good |
| 4th row | Excellent |
| 5th row | Fair |
Common Values
| Value | Count | Frequency (%) |
| Very good | 148444 | 33.3% |
| Good | 143598 | 32.3% |
| Excellent | 71878 | 16.1% |
| Fair | 60273 | 13.5% |
| Poor | 19741 | 4.4% |
| (Missing) | 1198 | 0.3% |
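The Frequency (%) column in a Common Values table appears to be each count divided by the total number of observations, with missing values kept in the denominator. The following sketch recomputes the column under that assumption; `frequency_table` is an illustrative helper, and the counts are copied from the table above.

```python
def frequency_table(counts, n_rows):
    """Map each label to its share of all rows, formatted to one decimal."""
    return {label: f"{100 * count / n_rows:.1f}%"
            for label, count in counts.items()}


general_health = {
    "Very good": 148_444,
    "Good": 143_598,
    "Excellent": 71_878,
    "Fair": 60_273,
    "Poor": 19_741,
    "(Missing)": 1_198,
}

# The counts, including missing values, add up to the number of observations.
n_rows = sum(general_health.values())
freq = frequency_table(general_health, n_rows)

for label, share in freq.items():
    print(f"{label}: {share}")
```

The rows for Poor and (Missing) reproduce the 4.4% and 0.3% printed in the report, which supports the assumption about the denominator.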
Words
| Value | Count | Frequency (%) |
| good | 292042 | 49.3% |
| very | 148444 | 25.1% |
| excellent | 71878 | 12.1% |
| fair | 60273 | 10.2% |
| poor | 19741 | 3.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 623566 | 21.7% |
| e | 292200 | 10.2% |
| d | 292042 | 10.1% |
| r | 228458 | 7.9% |
| V | 148444 | 5.2% |
| y | 148444 | 5.2% |
| (space) | 148444 | 5.2% |
| g | 148444 | 5.2% |
| l | 143756 | 5.0% |
| G | 143598 | 5.0% |
| Other values (9) | 559950 | 19.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2284968 | 79.4% |
| Uppercase Letter | 443934 | 15.4% |
| Space Separator | 148444 | 5.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 623566 | 27.3% |
| e | 292200 | 12.8% |
| d | 292042 | 12.8% |
| r | 228458 | 10.0% |
| y | 148444 | 6.5% |
| g | 148444 | 6.5% |
| l | 143756 | 6.3% |
| t | 71878 | 3.1% |
| n | 71878 | 3.1% |
| c | 71878 | 3.1% |
| Other values (3) | 192424 | 8.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| V | 148444 | 33.4% |
| G | 143598 | 32.3% |
| E | 71878 | 16.2% |
| F | 60273 | 13.6% |
| P | 19741 | 4.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 148444 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2728902 | 94.8% |
| Common | 148444 | 5.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 623566 | 22.9% |
| e | 292200 | 10.7% |
| d | 292042 | 10.7% |
| r | 228458 | 8.4% |
| V | 148444 | 5.4% |
| y | 148444 | 5.4% |
| g | 148444 | 5.4% |
| l | 143756 | 5.3% |
| G | 143598 | 5.3% |
| t | 71878 | 2.6% |
| Other values (8) | 488072 | 17.9% |
Common
| Value | Count | Frequency (%) |
| (space) | 148444 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2877346 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 623566 | 21.7% |
| e | 292200 | 10.2% |
| d | 292042 | 10.1% |
| r | 228458 | 7.9% |
| V | 148444 | 5.2% |
| y | 148444 | 5.2% |
| (space) | 148444 | 5.2% |
| g | 148444 | 5.2% |
| l | 143756 | 5.0% |
| G | 143598 | 5.0% |
| Other values (9) | 559950 | 19.5% |
PhysicalHealthDays
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 10927 |
| Missing (%) | 2.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.3479186 |
| Minimum | 0 |
| Maximum | 30 |
| Zeros | 267819 |
| Zeros (%) | 60.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 3 |
| 95-th percentile | 30 |
| Maximum | 30 |
| Range | 30 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 8.688912 |
|---|---|
| Coefficient of variation (CV) | 1.9984072 |
| Kurtosis | 3.4275893 |
| Mean | 4.3479186 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.1798178 |
| Sum | 1887888 |
| Variance | 75.497192 |
| Monotonicity | Not monotonic |
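Several of the descriptive statistics above are tied together by simple identities: the mean is the sum over the non-missing count, the variance is the squared standard deviation, the coefficient of variation is the standard deviation over the mean, and the IQR is Q3 minus Q1. The sketch below checks the reported PhysicalHealthDays figures against those identities; the variable names are illustrative.

```python
# Figures copied from the PhysicalHealthDays tables above.
n_rows = 445_132
n_missing = 10_927
total = 1_887_888          # "Sum"
std = 8.688912             # "Standard deviation"
q1, q3 = 0, 3

n_valid = n_rows - n_missing   # statistics are computed on non-missing values
mean = total / n_valid
variance = std ** 2
cv = std / mean
iqr = q3 - q1

print(f"mean     = {mean:.7f}")
print(f"variance = {variance:.6f}")
print(f"CV       = {cv:.7f}")
print(f"IQR      = {iqr}")
```

The recomputed values agree with the reported mean, variance, and CV up to the rounding of the printed inputs.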
Common Values
| Value | Count | Frequency (%) |
| 0 | 267819 | 60.2% |
| 30 | 33082 | 7.4% |
| 2 | 25256 | 5.7% |
| 1 | 17250 | 3.9% |
| 3 | 15948 | 3.6% |
| 5 | 15315 | 3.4% |
| 10 | 10589 | 2.4% |
| 7 | 9348 | 2.1% |
| 15 | 8787 | 2.0% |
| 4 | 8462 | 1.9% |
| Other values (21) | 22349 | 5.0% |
| (Missing) | 10927 | 2.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 267819 | 60.2% |
| 1 | 17250 | 3.9% |
| 2 | 25256 | 5.7% |
| 3 | 15948 | 3.6% |
| 4 | 8462 | 1.9% |
| 5 | 15315 | 3.4% |
| 6 | 2538 | 0.6% |
| 7 | 9348 | 2.1% |
| 8 | 1761 | 0.4% |
| 9 | 411 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 30 | 33082 | 7.4% |
| 29 | 365 | 0.1% |
| 28 | 751 | 0.2% |
| 27 | 188 | < 0.1% |
| 26 | 109 | < 0.1% |
| 25 | 2181 | 0.5% |
| 24 | 125 | < 0.1% |
| 23 | 99 | < 0.1% |
| 22 | 140 | < 0.1% |
| 21 | 1038 | 0.2% |
MentalHealthDays
Real number (ℝ)
MISSING  ZEROS 
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 9067 |
| Missing (%) | 2.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.3826494 |
| Minimum | 0 |
| Maximum | 30 |
| Zeros | 265229 |
| Zeros (%) | 59.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 30 |
| Range | 30 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 8.3874747 |
|---|---|
| Coefficient of variation (CV) | 1.9137909 |
| Kurtosis | 3.3592286 |
| Mean | 4.3826494 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.1232157 |
| Sum | 1911120 |
| Variance | 70.349731 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 265229 | 59.6% |
| 30 | 26990 | 6.1% |
| 2 | 23785 | 5.3% |
| 5 | 19951 | 4.5% |
| 10 | 15414 | 3.5% |
| 3 | 15345 | 3.4% |
| 15 | 14519 | 3.3% |
| 1 | 14409 | 3.2% |
| 20 | 9150 | 2.1% |
| 4 | 7943 | 1.8% |
| Other values (21) | 23330 | 5.2% |
| (Missing) | 9067 | 2.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 265229 | 59.6% |
| 1 | 14409 | 3.2% |
| 2 | 23785 | 5.3% |
| 3 | 15345 | 3.4% |
| 4 | 7943 | 1.8% |
| 5 | 19951 | 4.5% |
| 6 | 2305 | 0.5% |
| 7 | 7844 | 1.8% |
| 8 | 1749 | 0.4% |
| 9 | 322 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 30 | 26990 | 6.1% |
| 29 | 502 | 0.1% |
| 28 | 910 | 0.2% |
| 27 | 241 | 0.1% |
| 26 | 106 | < 0.1% |
| 25 | 3078 | 0.7% |
| 24 | 124 | < 0.1% |
| 23 | 97 | < 0.1% |
| 22 | 193 | < 0.1% |
| 21 | 549 | 0.1% |
LastCheckupTime
Categorical
MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 8308 |
| Missing (%) | 1.9% |
| Memory size | 3.4 MiB |
| Within past year (anytime less than 12 months ago) | 350944 |
|---|---|
| Within past 2 years (1 year but less than 2 years ago) | 41919 |
| Within past 5 years (2 years but less than 5 years ago) | 24882 |
| 5 or more years ago | 19079 |
Length
| Max length | 55 |
|---|---|
| Median length | 50 |
| Mean length | 49.314683 |
| Min length | 19 |
Characters and Unicode
| Total characters | 21541837 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Within past year (anytime less than 12 months ago) |
|---|---|
| 2nd row | Within past year (anytime less than 12 months ago) |
| 3rd row | Within past year (anytime less than 12 months ago) |
| 4th row | Within past year (anytime less than 12 months ago) |
| 5th row | Within past year (anytime less than 12 months ago) |
Common Values
| Value | Count | Frequency (%) |
| Within past year (anytime less than 12 months ago) | 350944 | 78.8% |
| Within past 2 years (1 year but less than 2 years ago) | 41919 | 9.4% |
| Within past 5 years (2 years but less than 5 years ago) | 24882 | 5.6% |
| 5 or more years ago | 19079 | 4.3% |
| (Missing) | 8308 | 1.9% |
Words
| Value | Count | Frequency (%) |
| ago | 436824 | 10.8% |
| within | 417745 | 10.3% |
| past | 417745 | 10.3% |
| less | 417745 | 10.3% |
| than | 417745 | 10.3% |
| year | 392863 | 9.7% |
| anytime | 350944 | 8.7% |
| 12 | 350944 | 8.7% |
| months | 350944 | 8.7% |
| years | 177563 | 4.4% |
| Other values (6) | 324441 | 8.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 3618679 | 16.8% |
| a | 2193684 | 10.2% |
| t | 2021924 | 9.4% |
| s | 1781742 | 8.3% |
| n | 1537378 | 7.1% |
| e | 1358194 | 6.3% |
| h | 1186434 | 5.5% |
| i | 1186434 | 5.5% |
| y | 921370 | 4.3% |
| o | 825926 | 3.8% |
| Other values (13) | 4910072 | 22.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 15748553 | 73.1% |
| Space Separator | 3618679 | 16.8% |
| Decimal Number | 921370 | 4.3% |
| Close Punctuation | 417745 | 1.9% |
| Uppercase Letter | 417745 | 1.9% |
| Open Punctuation | 417745 | 1.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2193684 | 13.9% |
| t | 2021924 | 12.8% |
| s | 1781742 | 11.3% |
| n | 1537378 | 9.8% |
| e | 1358194 | 8.6% |
| h | 1186434 | 7.5% |
| i | 1186434 | 7.5% |
| y | 921370 | 5.9% |
| o | 825926 | 5.2% |
| m | 720967 | 4.6% |
| Other values (6) | 2014500 | 12.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 459664 | 49.9% |
| 1 | 392863 | 42.6% |
| 5 | 68843 | 7.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 3618679 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 417745 | 100.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| W | 417745 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 417745 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 16166298 | 75.0% |
| Common | 5375539 | 25.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 2193684 | 13.6% |
| t | 2021924 | 12.5% |
| s | 1781742 | 11.0% |
| n | 1537378 | 9.5% |
| e | 1358194 | 8.4% |
| h | 1186434 | 7.3% |
| i | 1186434 | 7.3% |
| y | 921370 | 5.7% |
| o | 825926 | 5.1% |
| m | 720967 | 4.5% |
| Other values (7) | 2432245 | 15.0% |
Common
| Value | Count | Frequency (%) |
| (space) | 3618679 | 67.3% |
| 2 | 459664 | 8.6% |
| ) | 417745 | 7.8% |
| ( | 417745 | 7.8% |
| 1 | 392863 | 7.3% |
| 5 | 68843 | 1.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 21541837 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 3618679 | 16.8% |
| a | 2193684 | 10.2% |
| t | 2021924 | 9.4% |
| s | 1781742 | 8.3% |
| n | 1537378 | 7.1% |
| e | 1358194 | 6.3% |
| h | 1186434 | 5.5% |
| i | 1186434 | 5.5% |
| y | 921370 | 4.3% |
| o | 825926 | 3.8% |
| Other values (13) | 4910072 | 22.8% |
PhysicalActivities
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1093 |
| Missing (%) | 0.2% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| True | 337559 | 75.8% |
| False | 106480 | 23.9% |
| (Missing) | 1093 | 0.2% |
SleepHours
Real number (ℝ)
MISSING 
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 5453 |
| Missing (%) | 1.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.0229827 |
| Minimum | 1 |
| Maximum | 24 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 6 |
| median | 7 |
| Q3 | 8 |
| 95-th percentile | 9 |
| Maximum | 24 |
| Range | 23 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.502425 |
|---|---|
| Coefficient of variation (CV) | 0.21392976 |
| Kurtosis | 8.7411699 |
| Mean | 7.0229827 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.7646025 |
| Sum | 3087858 |
| Variance | 2.2572809 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 7 | 132927 | 29.9% |
| 8 | 125442 | 28.2% |
| 6 | 95880 | 21.5% |
| 5 | 30122 | 6.8% |
| 9 | 21210 | 4.8% |
| 4 | 12433 | 2.8% |
| 10 | 10459 | 2.3% |
| 3 | 3260 | 0.7% |
| 12 | 3004 | 0.7% |
| 2 | 1549 | 0.3% |
| Other values (14) | 3393 | 0.8% |
| (Missing) | 5453 | 1.2% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 1154 | 0.3% |
| 2 | 1549 | 0.3% |
| 3 | 3260 | 0.7% |
| 4 | 12433 | 2.8% |
| 5 | 30122 | 6.8% |
| 6 | 95880 | 21.5% |
| 7 | 132927 | 29.9% |
| 8 | 125442 | 28.2% |
| 9 | 21210 | 4.8% |
| 10 | 10459 | 2.3% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 24 | 52 | < 0.1% |
| 23 | 18 | < 0.1% |
| 22 | 19 | < 0.1% |
| 21 | 4 | < 0.1% |
| 20 | 143 | < 0.1% |
| 19 | 16 | < 0.1% |
| 18 | 168 | < 0.1% |
| 17 | 27 | < 0.1% |
| 16 | 329 | 0.1% |
| 15 | 317 | 0.1% |
RemovedTeeth
Categorical
MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 11360 |
| Missing (%) | 2.6% |
| Memory size | 3.4 MiB |
| None of them | 233455 |
|---|---|
| 1 to 5 | 129294 |
| 6 or more, but not all | 45570 |
| All | 25453 |
Length
| Max length | 22 |
|---|---|
| Median length | 12 |
| Mean length | 10.734033 |
| Min length | 3 |
Characters and Unicode
| Total characters | 4656123 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | None of them |
|---|---|
| 2nd row | None of them |
| 3rd row | 1 to 5 |
| 4th row | 6 or more, but not all |
| 5th row | None of them |
Common Values
| Value | Count | Frequency (%) |
| None of them | 233455 | 52.4% |
| 1 to 5 | 129294 | 29.0% |
| 6 or more, but not all | 45570 | 10.2% |
| All | 25453 | 5.7% |
| (Missing) | 11360 | 2.6% |
Words
| Value | Count | Frequency (%) |
| none | 233455 | 16.8% |
| of | 233455 | 16.8% |
| them | 233455 | 16.8% |
| 1 | 129294 | 9.3% |
| to | 129294 | 9.3% |
| 5 | 129294 | 9.3% |
| all | 71023 | 5.1% |
| 6 | 45570 | 3.3% |
| or | 45570 | 3.3% |
| more | 45570 | 3.3% |
| Other values (2) | 91140 | 6.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 953348 | 20.5% |
| o | 732914 | 15.7% |
| e | 512480 | 11.0% |
| t | 453889 | 9.7% |
| n | 279025 | 6.0% |
| m | 279025 | 6.0% |
| N | 233455 | 5.0% |
| f | 233455 | 5.0% |
| h | 233455 | 5.0% |
| l | 142046 | 3.1% |
| Other values (9) | 603031 | 13.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3094139 | 66.5% |
| Space Separator | 953348 | 20.5% |
| Decimal Number | 304158 | 6.5% |
| Uppercase Letter | 258908 | 5.6% |
| Other Punctuation | 45570 | 1.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 732914 | 23.7% |
| e | 512480 | 16.6% |
| t | 453889 | 14.7% |
| n | 279025 | 9.0% |
| m | 279025 | 9.0% |
| f | 233455 | 7.5% |
| h | 233455 | 7.5% |
| l | 142046 | 4.6% |
| r | 91140 | 2.9% |
| b | 45570 | 1.5% |
| Other values (2) | 91140 | 2.9% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 129294 | 42.5% |
| 5 | 129294 | 42.5% |
| 6 | 45570 | 15.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 233455 | 90.2% |
| A | 25453 | 9.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 953348 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 45570 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3353047 | 72.0% |
| Common | 1303076 | 28.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 732914 | 21.9% |
| e | 512480 | 15.3% |
| t | 453889 | 13.5% |
| n | 279025 | 8.3% |
| m | 279025 | 8.3% |
| N | 233455 | 7.0% |
| f | 233455 | 7.0% |
| h | 233455 | 7.0% |
| l | 142046 | 4.2% |
| r | 91140 | 2.7% |
| Other values (4) | 162163 | 4.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 953348 | 73.2% |
| 1 | 129294 | 9.9% |
| 5 | 129294 | 9.9% |
| 6 | 45570 | 3.5% |
| , | 45570 | 3.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4656123 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 953348 | 20.5% |
| o | 732914 | 15.7% |
| e | 512480 | 11.0% |
| t | 453889 | 9.7% |
| n | 279025 | 6.0% |
| m | 279025 | 6.0% |
| N | 233455 | 5.0% |
| f | 233455 | 5.0% |
| h | 233455 | 5.0% |
| l | 142046 | 3.1% |
| Other values (9) | 603031 | 13.0% |
HadHeartAttack
Boolean
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3065 |
| Missing (%) | 0.7% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 416959 | 93.7% |
| True | 25108 | 5.6% |
| (Missing) | 3065 | 0.7% |
HadAngina
Boolean
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 4405 |
| Missing (%) | 1.0% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 414176 | 93.0% |
| True | 26551 | 6.0% |
| (Missing) | 4405 | 1.0% |
HadStroke
Boolean
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1557 |
| Missing (%) | 0.3% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 424336 | 95.3% |
| True | 19239 | 4.3% |
| (Missing) | 1557 | 0.3% |
HadAsthma
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1773 |
| Missing (%) | 0.4% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 376665 | 84.6% |
| True | 66694 | 15.0% |
| (Missing) | 1773 | 0.4% |
HadSkinCancer
Boolean
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3143 |
| Missing (%) | 0.7% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 406504 | 91.3% |
| True | 35485 | 8.0% |
| (Missing) | 3143 | 0.7% |
HadCOPD
Boolean
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2219 |
| Missing (%) | 0.5% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 407257 | 91.5% |
| True | 35656 | 8.0% |
| (Missing) | 2219 | 0.5% |
HadDepressiveDisorder
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2812 |
| Missing (%) | 0.6% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 350910 | 78.8% |
| True | 91410 | 20.5% |
| (Missing) | 2812 | 0.6% |
HadKidneyDisease
Boolean
IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1926 |
| Missing (%) | 0.4% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 422891 | 95.0% |
| True | 20315 | 4.6% |
| (Missing) | 1926 | 0.4% |
HadArthritis
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2633 |
| Missing (%) | 0.6% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 291351 | 65.5% |
| True | 151148 | 34.0% |
| (Missing) | 2633 | 0.6% |
HadDiabetes
Categorical
IMBALANCE 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1087 |
| Missing (%) | 0.2% |
| Memory size | 3.4 MiB |
| No | 368722 |
|---|---|
| Yes | 61158 |
| No, pre-diabetes or borderline diabetes | 10329 |
| Yes, but only during pregnancy (female) | 3836 |
Length
| Max length | 39 |
|---|---|
| Median length | 2 |
| Mean length | 3.3180263 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1473353 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Yes |
|---|---|
| 2nd row | No |
| 3rd row | No |
| 4th row | No |
| 5th row | No |
Common Values
| Value | Count | Frequency (%) |
| No | 368722 | 82.8% |
| Yes | 61158 | 13.7% |
| No, pre-diabetes or borderline diabetes | 10329 | 2.3% |
| Yes, but only during pregnancy (female) | 3836 | 0.9% |
| (Missing) | 1087 | 0.2% |
Words
| Value | Count | Frequency (%) |
| no | 379051 | 75.1% |
| yes | 64994 | 12.9% |
| pre-diabetes | 10329 | 2.0% |
| or | 10329 | 2.0% |
| borderline | 10329 | 2.0% |
| diabetes | 10329 | 2.0% |
| but | 3836 | 0.8% |
| only | 3836 | 0.8% |
| during | 3836 | 0.8% |
| pregnancy | 3836 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 403545 | 27.4% |
| N | 379051 | 25.7% |
| e | 148805 | 10.1% |
| s | 85652 | 5.8% |
| Y | 64994 | 4.4% |
| (space) | 60496 | 4.1% |
| r | 48988 | 3.3% |
| d | 34823 | 2.4% |
| b | 34823 | 2.4% |
| i | 34823 | 2.4% |
| Other values (15) | 177353 | 12.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 936646 | 63.6% |
| Uppercase Letter | 444045 | 30.1% |
| Space Separator | 60496 | 4.1% |
| Other Punctuation | 14165 | 1.0% |
| Dash Punctuation | 10329 | 0.7% |
| Open Punctuation | 3836 | 0.3% |
| Close Punctuation | 3836 | 0.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 403545 | 43.1% |
| e | 148805 | 15.9% |
| s | 85652 | 9.1% |
| r | 48988 | 5.2% |
| d | 34823 | 3.7% |
| b | 34823 | 3.7% |
| i | 34823 | 3.7% |
| a | 28330 | 3.0% |
| n | 25673 | 2.7% |
| t | 24494 | 2.6% |
| Other values (8) | 66690 | 7.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 379051 | 85.4% |
| Y | 64994 | 14.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 60496 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 14165 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 10329 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3836 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3836 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1380691 | 93.7% |
| Common | 92662 | 6.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 403545 | 29.2% |
| N | 379051 | 27.5% |
| e | 148805 | 10.8% |
| s | 85652 | 6.2% |
| Y | 64994 | 4.7% |
| r | 48988 | 3.5% |
| d | 34823 | 2.5% |
| b | 34823 | 2.5% |
| i | 34823 | 2.5% |
| a | 28330 | 2.1% |
| Other values (10) | 116857 | 8.5% |
Common
| Value | Count | Frequency (%) |
| (space) | 60496 | 65.3% |
| , | 14165 | 15.3% |
| - | 10329 | 11.1% |
| ( | 3836 | 4.1% |
| ) | 3836 | 4.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1473353 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 403545 | 27.4% |
| N | 379051 | 25.7% |
| e | 148805 | 10.1% |
| s | 85652 | 5.8% |
| Y | 64994 | 4.4% |
| (space) | 60496 | 4.1% |
| r | 48988 | 3.3% |
| d | 34823 | 2.4% |
| b | 34823 | 2.4% |
| i | 34823 | 2.4% |
| Other values (15) | 177353 | 12.0% |
DeafOrHardOfHearing
Boolean
IMBALANCE  MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 20647 |
| Missing (%) | 4.6% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 385539 | 86.6% |
| True | 38946 | 8.7% |
| (Missing) | 20647 | 4.6% |
BlindOrVisionDifficulty
Boolean
IMBALANCE  MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 21564 |
| Missing (%) | 4.8% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 399910 | 89.8% |
| True | 23658 | 5.3% |
| (Missing) | 21564 | 4.8% |
DifficultyConcentrating
Boolean
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 24240 |
| Missing (%) | 5.4% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 370792 | 83.3% |
| True | 50100 | 11.3% |
| (Missing) | 24240 | 5.4% |
DifficultyWalking
Boolean
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 24012 |
| Missing (%) | 5.4% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 353039 | 79.3% |
| True | 68081 | 15.3% |
| (Missing) | 24012 | 5.4% |
DifficultyDressingBathing
Boolean
IMBALANCE  MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 23915 |
| Missing (%) | 5.4% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 404404 | 90.9% |
| True | 16813 | 3.8% |
| (Missing) | 23915 | 5.4% |
DifficultyErrands
Boolean
IMBALANCE  MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 25656 |
| Missing (%) | 5.8% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 387029 | 86.9% |
| True | 32447 | 7.3% |
| (Missing) | 25656 | 5.8% |
SmokerStatus
Categorical
MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 35462 |
| Missing (%) | 8.0% |
| Memory size | 3.4 MiB |
| Never smoked | 245955 |
|---|---|
| Former smoker | 113774 |
| Current smoker - now smokes every day | 36003 |
| Current smoker - now smokes some days | 13938 |
Length
| Max length | 37 |
|---|---|
| Median length | 12 |
| Mean length | 15.325357 |
| Min length | 12 |
Characters and Unicode
| Total characters | 6278339 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Never smoked |
|---|---|
| 2nd row | Never smoked |
| 3rd row | Never smoked |
| 4th row | Current smoker - now smokes some days |
| 5th row | Never smoked |
Common Values
| Value | Count | Frequency (%) |
| Never smoked | 245955 | 55.3% |
| Former smoker | 113774 | 25.6% |
| Current smoker - now smokes every day | 36003 | 8.1% |
| Current smoker - now smokes some days | 13938 | 3.1% |
| (Missing) | 35462 | 8.0% |
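The Length and Characters statistics reported for SmokerStatus can be cross-checked against the Common Values counts. The sketch below assumes that "Total characters" sums the string length of every non-missing value and that "Mean length" divides that sum by the number of non-missing rows; neither definition is stated in the report itself.

```python
# Non-missing value counts copied from the Common Values table above.
value_counts = {
    "Never smoked": 245955,
    "Former smoker": 113774,
    "Current smoker - now smokes every day": 36003,
    "Current smoker - now smokes some days": 13938,
}

# Each value contributes its string length once per occurrence.
total_characters = sum(len(value) * n for value, n in value_counts.items())
non_missing = sum(value_counts.values())
mean_length = total_characters / non_missing

print(total_characters)       # compare with "Total characters"
print(round(mean_length, 6))  # compare with "Mean length"
```

With these counts the totals match the reported 6278339 characters and mean length of 15.325357, which supports the assumption that missing rows are excluded from both statistics.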
Words
| Value | Count | Frequency (%) |
| never | 245955 | |
| smoked | 245955 | |
| smoker | 163715 | |
| former | 113774 | |
| current | 49941 | 4.7% |
| | 49941 | 4.7% |
| now | 49941 | 4.7% |
| smokes | 49941 | 4.7% |
| every | 36003 | 3.4% |
| day | 36003 | 3.4% |
| Other values (2) | 27876 | 2.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1201180 | |
| r | 773103 | |
| (space) | 659375 | 10.5% |
| o | 637264 | |
| m | 587323 | |
| s | 537428 | |
| k | 459611 | 7.3% |
| d | 295896 | 4.7% |
| v | 281958 | 4.5% |
| N | 245955 | 3.9% |
| Other values (9) | 599246 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5159353 | |
| Space Separator | 659375 | 10.5% |
| Uppercase Letter | 409670 | 6.5% |
| Dash Punctuation | 49941 | 0.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1201180 | |
| r | 773103 | |
| o | 637264 | |
| m | 587323 | |
| s | 537428 | |
| k | 459611 | 8.9% |
| d | 295896 | 5.7% |
| v | 281958 | 5.5% |
| n | 99882 | 1.9% |
| y | 85944 | 1.7% |
| Other values (4) | 199764 | 3.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 245955 | |
| F | 113774 | |
| C | 49941 | 12.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 659375 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 49941 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5569023 | |
| Common | 709316 | 11.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1201180 | |
| r | 773103 | |
| o | 637264 | |
| m | 587323 | |
| s | 537428 | |
| k | 459611 | 8.3% |
| d | 295896 | 5.3% |
| v | 281958 | 5.1% |
| N | 245955 | 4.4% |
| F | 113774 | 2.0% |
| Other values (7) | 435531 | 7.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 659375 | 93.0% |
| - | 49941 | 7.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6278339 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1201180 | |
| r | 773103 | |
| (space) | 659375 | 10.5% |
| o | 637264 | |
| m | 587323 | |
| s | 537428 | |
| k | 459611 | 7.3% |
| d | 295896 | 4.7% |
| v | 281958 | 4.5% |
| N | 245955 | 3.9% |
| Other values (9) | 599246 |
ECigaretteUsage
Categorical
MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 35660 |
| Missing (%) | 8.0% |
| Memory size | 3.4 MiB |
Length
| Max length | 41 |
|---|---|
| Median length | 41 |
| Mean length | 36.260579 |
| Min length | 18 |
Characters and Unicode
| Total characters | 14847692 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Not at all (right now) |
|---|---|
| 2nd row | Never used e-cigarettes in my entire life |
| 3rd row | Never used e-cigarettes in my entire life |
| 4th row | Never used e-cigarettes in my entire life |
| 5th row | Never used e-cigarettes in my entire life |
Common Values
| Value | Count | Frequency (%) |
| Never used e-cigarettes in my entire life | 311988 | 70.1% |
| Not at all (right now) | 75368 | 16.9% |
| Use them some days | 11734 | 2.6% |
| Use them every day | 10382 | 2.3% |
| (Missing) | 35660 | 8.0% |
Words
| Value | Count | Frequency (%) |
| never | 311988 | |
| used | 311988 | |
| e-cigarettes | 311988 | |
| in | 311988 | |
| my | 311988 | |
| entire | 311988 | |
| life | 311988 | |
| now | 75368 | 2.8% |
| right | 75368 | 2.8% |
| all | 75368 | 2.8% |
| Other values (8) | 239200 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 2884622 | |
| (space) | 2239748 | 15.1% |
| i | 1323320 | 8.9% |
| t | 1184184 | 8.0% |
| r | 1021714 | 6.9% |
| n | 699344 | 4.7% |
| s | 669560 | 4.5% |
| a | 484840 | 3.3% |
| l | 462724 | 3.1% |
| g | 387356 | 2.6% |
| Other values (15) | 3490280 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11735748 | |
| Space Separator | 2239748 | 15.1% |
| Uppercase Letter | 409472 | 2.8% |
| Dash Punctuation | 311988 | 2.1% |
| Open Punctuation | 75368 | 0.5% |
| Close Punctuation | 75368 | 0.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2884622 | |
| i | 1323320 | |
| t | 1184184 | |
| r | 1021714 | 8.7% |
| n | 699344 | 6.0% |
| s | 669560 | 5.7% |
| a | 484840 | 4.1% |
| l | 462724 | 3.9% |
| g | 387356 | 3.3% |
| m | 345838 | 2.9% |
| Other values (9) | 2272246 |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 387356 | |
| U | 22116 | 5.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2239748 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 311988 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 75368 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 75368 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12145220 | |
| Common | 2702472 | 18.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2884622 | |
| i | 1323320 | |
| t | 1184184 | |
| r | 1021714 | 8.4% |
| n | 699344 | 5.8% |
| s | 669560 | 5.5% |
| a | 484840 | 4.0% |
| l | 462724 | 3.8% |
| g | 387356 | 3.2% |
| N | 387356 | 3.2% |
| Other values (11) | 2640200 |
Common
| Value | Count | Frequency (%) |
| (space) | 2239748 | 82.9% |
| - | 311988 | 11.5% |
| ( | 75368 | 2.8% |
| ) | 75368 | 2.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14847692 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 2884622 | |
| (space) | 2239748 | 15.1% |
| i | 1323320 | 8.9% |
| t | 1184184 | 8.0% |
| r | 1021714 | 6.9% |
| n | 699344 | 4.7% |
| s | 669560 | 4.5% |
| a | 484840 | 3.3% |
| l | 462724 | 3.1% |
| g | 387356 | 2.6% |
| Other values (15) | 3490280 |
ChestScan
Boolean
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 56046 |
| Missing (%) | 12.6% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 223221 | 50.1% |
| True | 165865 | 37.3% |
| (Missing) | 56046 | 12.6% |
RaceEthnicityCategory
Categorical
MISSING 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 14057 |
| Missing (%) | 3.2% |
| Memory size | 3.4 MiB |
Length
| Max length | 29 |
|---|---|
| Median length | 24 |
| Mean length | 22.692736 |
| Min length | 8 |
Characters and Unicode
| Total characters | 9782271 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | White only, Non-Hispanic |
|---|---|
| 2nd row | White only, Non-Hispanic |
| 3rd row | White only, Non-Hispanic |
| 4th row | White only, Non-Hispanic |
| 5th row | White only, Non-Hispanic |
Common Values
| Value | Count | Frequency (%) |
| White only, Non-Hispanic | 320421 | 72.0% |
| Hispanic | 42917 | 9.6% |
| Black only, Non-Hispanic | 35446 | 8.0% |
| Other race only, Non-Hispanic | 22713 | 5.1% |
| Multiracial, Non-Hispanic | 9578 | 2.2% |
| (Missing) | 14057 | 3.2% |
Words
| Value | Count | Frequency (%) |
| non-hispanic | 388158 | |
| only | 378580 | |
| white | 320421 | |
| hispanic | 42917 | 3.5% |
| black | 35446 | 2.9% |
| other | 22713 | 1.9% |
| race | 22713 | 1.9% |
| multiracial | 9578 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 1201727 | 12.3% |
| n | 1197813 | 12.2% |
| (space) | 789451 | 8.1% |
| o | 766738 | 7.8% |
| a | 508390 | 5.2% |
| c | 498812 | 5.1% |
| l | 433182 | 4.4% |
| H | 431075 | 4.4% |
| s | 431075 | 4.4% |
| p | 431075 | 4.4% |
| Other values (14) | 3092933 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7009113 | |
| Uppercase Letter | 1207391 | 12.3% |
| Space Separator | 789451 | 8.1% |
| Other Punctuation | 388158 | 4.0% |
| Dash Punctuation | 388158 | 4.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 1201727 | |
| n | 1197813 | |
| o | 766738 | |
| a | 508390 | |
| c | 498812 | |
| l | 433182 | 6.2% |
| s | 431075 | 6.2% |
| p | 431075 | 6.2% |
| y | 378580 | 5.4% |
| e | 365847 | 5.2% |
| Other values (5) | 795874 |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 431075 | |
| N | 388158 | |
| W | 320421 | |
| B | 35446 | 2.9% |
| O | 22713 | 1.9% |
| M | 9578 | 0.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 789451 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 388158 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 388158 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8216504 | |
| Common | 1565767 | 16.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 1201727 | |
| n | 1197813 | |
| o | 766738 | 9.3% |
| a | 508390 | 6.2% |
| c | 498812 | 6.1% |
| l | 433182 | 5.3% |
| H | 431075 | 5.2% |
| s | 431075 | 5.2% |
| p | 431075 | 5.2% |
| N | 388158 | 4.7% |
| Other values (11) | 1928459 |
Common
| Value | Count | Frequency (%) |
| (space) | 789451 | 50.4% |
| , | 388158 | 24.8% |
| - | 388158 | 24.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9782271 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 1201727 | 12.3% |
| n | 1197813 | 12.2% |
| (space) | 789451 | 8.1% |
| o | 766738 | 7.8% |
| a | 508390 | 5.2% |
| c | 498812 | 5.1% |
| l | 433182 | 4.4% |
| H | 431075 | 4.4% |
| s | 431075 | 4.4% |
| p | 431075 | 4.4% |
| Other values (14) | 3092933 |
AgeCategory
Categorical
MISSING 
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 9079 |
| Missing (%) | 2.0% |
| Memory size | 3.4 MiB |
Length
| Max length | 15 |
|---|---|
| Median length | 12 |
| Mean length | 12.249403 |
| Min length | 12 |
Characters and Unicode
| Total characters | 5341389 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Age 80 or older |
|---|---|
| 2nd row | Age 80 or older |
| 3rd row | Age 55 to 59 |
| 4th row | Age 40 to 44 |
| 5th row | Age 80 or older |
Common Values
| Value | Count | Frequency (%) |
| Age 65 to 69 | 47099 | 10.6% |
| Age 60 to 64 | 44511 | 10.0% |
| Age 70 to 74 | 43472 | 9.8% |
| Age 55 to 59 | 36821 | 8.3% |
| Age 80 or older | 36251 | 8.1% |
| Age 50 to 54 | 33644 | 7.6% |
| Age 75 to 79 | 32518 | 7.3% |
| Age 40 to 44 | 29942 | 6.7% |
| Age 45 to 49 | 28531 | 6.4% |
| Age 35 to 39 | 28526 | 6.4% |
| Other values (3) | 74738 | 16.8% |
Words
| Value | Count | Frequency (%) |
| age | 436053 | |
| to | 399802 | |
| 65 | 47099 | 2.7% |
| 69 | 47099 | 2.7% |
| 60 | 44511 | 2.6% |
| 64 | 44511 | 2.6% |
| 70 | 43472 | 2.5% |
| 74 | 43472 | 2.5% |
| 55 | 36821 | 2.1% |
| 59 | 36821 | 2.1% |
| Other values (19) | 564551 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1308159 | 24.5% |
| e | 472304 | 8.8% |
| o | 472304 | 8.8% |
| A | 436053 | 8.2% |
| g | 436053 | 8.2% |
| t | 399802 | 7.5% |
| 5 | 336415 | 6.3% |
| 4 | 321263 | 6.0% |
| 0 | 213627 | 4.0% |
| 9 | 195485 | 3.7% |
| Other values (9) | 749924 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1925467 | |
| Decimal Number | 1671710 | |
| Space Separator | 1308159 | |
| Uppercase Letter | 436053 | 8.2% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 336415 | |
| 4 | 321263 | |
| 0 | 213627 | |
| 9 | 195485 | |
| 6 | 183220 | |
| 7 | 151980 | |
| 3 | 108666 | 6.5% |
| 2 | 70921 | 4.2% |
| 8 | 63192 | 3.8% |
| 1 | 26941 | 1.6% |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 472304 | |
| o | 472304 | |
| g | 436053 | |
| t | 399802 | |
| r | 72502 | 3.8% |
| l | 36251 | 1.9% |
| d | 36251 | 1.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1308159 | 100.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 436053 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2979869 | |
| Latin | 2361520 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 1308159 | 43.9% |
| 5 | 336415 | 11.3% |
| 4 | 321263 | 10.8% |
| 0 | 213627 | 7.2% |
| 9 | 195485 | 6.6% |
| 6 | 183220 | 6.1% |
| 7 | 151980 | 5.1% |
| 3 | 108666 | 3.6% |
| 2 | 70921 | 2.4% |
| 8 | 63192 | 2.1% |
Latin
| Value | Count | Frequency (%) |
| e | 472304 | |
| o | 472304 | |
| A | 436053 | |
| g | 436053 | |
| t | 399802 | |
| r | 72502 | 3.1% |
| l | 36251 | 1.5% |
| d | 36251 | 1.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5341389 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1308159 | 24.5% |
| e | 472304 | 8.8% |
| o | 472304 | 8.8% |
| A | 436053 | 8.2% |
| g | 436053 | 8.2% |
| t | 399802 | 7.5% |
| 5 | 336415 | 6.3% |
| 4 | 321263 | 6.0% |
| 0 | 213627 | 4.0% |
| 9 | 195485 | 3.7% |
| Other values (9) | 749924 |
HeightInMeters
Real number (ℝ)
MISSING 
| Distinct | 109 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 28652 |
| Missing (%) | 6.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.7026906 |
| Minimum | 0.91 |
|---|---|
| Maximum | 2.41 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.4 MiB |
Quantile statistics
| Minimum | 0.91 |
|---|---|
| 5-th percentile | 1.52 |
| Q1 | 1.63 |
| median | 1.7 |
| Q3 | 1.78 |
| 95-th percentile | 1.88 |
| Maximum | 2.41 |
| Range | 1.5 |
| Interquartile range (IQR) | 0.15 |
Descriptive statistics
| Standard deviation | 0.1071775 |
|---|---|
| Coefficient of variation (CV) | 0.062945964 |
| Kurtosis | 0.18229935 |
| Mean | 1.7026906 |
| Median Absolute Deviation (MAD) | 0.08 |
| Skewness | 0.028899535 |
| Sum | 709136.57 |
| Variance | 0.011487016 |
| Monotonicity | Not monotonic |
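Several of the HeightInMeters statistics above are derived from others in the same tables, so they can be used as a quick consistency check. The sketch below recomputes the range, interquartile range, variance and coefficient of variation from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Values copied from the HeightInMeters tables above.
minimum, maximum = 0.91, 2.41
q1, q3 = 1.63, 1.78
mean = 1.7026906
std = 0.1071775

value_range = maximum - minimum  # reported Range: 1.5
iqr = q3 - q1                    # reported IQR: 0.15
variance = std ** 2              # reported Variance: 0.011487016
cv = std / mean                  # reported CV: 0.062945964

print(f"range={value_range:.2f} iqr={iqr:.2f}")
print(f"variance={variance:.9f} cv={cv:.9f}")
```

The same relationships hold for WeightInKilograms and BMI, so the check can be repeated with the values from those sections.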
| Value | Count | Frequency (%) |
| 1.68 | 36782 | 8.3% |
| 1.63 | 35622 | 8.0% |
| 1.7 | 34038 | 7.6% |
| 1.65 | 32785 | 7.4% |
| 1.78 | 32038 | 7.2% |
| 1.73 | 30910 | 6.9% |
| 1.75 | 29157 | 6.6% |
| 1.6 | 28296 | 6.4% |
| 1.83 | 28294 | 6.4% |
| 1.57 | 26944 | 6.1% |
| Other values (99) | 101614 | 22.8% |
| (Missing) | 28652 | 6.4% |
| Value | Count | Frequency (%) |
| 0.91 | 24 | < 0.1% |
| 0.92 | 1 | < 0.1% |
| 0.95 | 1 | < 0.1% |
| 0.97 | 4 | < 0.1% |
| 0.99 | 1 | < 0.1% |
| 1 | 4 | < 0.1% |
| 1.02 | 3 | < 0.1% |
| 1.03 | 3 | < 0.1% |
| 1.04 | 18 | < 0.1% |
| 1.05 | 29 | < 0.1% |
| Value | Count | Frequency (%) |
| 2.41 | 5 | < 0.1% |
| 2.36 | 1 | < 0.1% |
| 2.34 | 4 | < 0.1% |
| 2.29 | 5 | < 0.1% |
| 2.26 | 11 | < 0.1% |
| 2.24 | 2 | < 0.1% |
| 2.21 | 9 | < 0.1% |
| 2.18 | 10 | < 0.1% |
| 2.16 | 10 | < 0.1% |
| 2.13 | 29 | < 0.1% |
WeightInKilograms
Real number (ℝ)
MISSING 
| Distinct | 599 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 42078 |
| Missing (%) | 9.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 83.07447 |
| Minimum | 22.68 |
|---|---|
| Maximum | 292.57 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.4 MiB |
Quantile statistics
| Minimum | 22.68 |
|---|---|
| 5-th percentile | 54.43 |
| Q1 | 68.04 |
| median | 80.74 |
| Q3 | 95.25 |
| 95-th percentile | 122.47 |
| Maximum | 292.57 |
| Range | 269.89 |
| Interquartile range (IQR) | 27.21 |
Descriptive statistics
| Standard deviation | 21.448173 |
|---|---|
| Coefficient of variation (CV) | 0.25818007 |
| Kurtosis | 2.7389723 |
| Mean | 83.07447 |
| Median Absolute Deviation (MAD) | 12.7 |
| Skewness | 1.0756118 |
| Sum | 33483498 |
| Variance | 460.02411 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90.72 | 21311 | 4.8% |
| 81.65 | 19709 | 4.4% |
| 68.04 | 17595 | 4.0% |
| 72.57 | 17177 | 3.9% |
| 77.11 | 15979 | 3.6% |
| 86.18 | 14202 | 3.2% |
| 63.5 | 12924 | 2.9% |
| 79.38 | 11722 | 2.6% |
| 99.79 | 10890 | 2.4% |
| 74.84 | 10809 | 2.4% |
| Other values (589) | 250736 | 56.3% |
| (Missing) | 42078 | 9.5% |
| Value | Count | Frequency (%) |
| 22.68 | 10 | < 0.1% |
| 23 | 1 | < 0.1% |
| 23.13 | 1 | < 0.1% |
| 23.59 | 2 | < 0.1% |
| 24 | 1 | < 0.1% |
| 24.04 | 3 | < 0.1% |
| 24.49 | 2 | < 0.1% |
| 24.95 | 6 | < 0.1% |
| 25.4 | 3 | < 0.1% |
| 25.85 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 292.57 | 1 | < 0.1% |
| 290.3 | 2 | < 0.1% |
| 285 | 1 | < 0.1% |
| 284.86 | 1 | < 0.1% |
| 281.68 | 1 | < 0.1% |
| 281 | 1 | < 0.1% |
| 280.32 | 1 | < 0.1% |
| 280 | 1 | < 0.1% |
| 278.96 | 1 | < 0.1% |
| 276.24 | 1 | < 0.1% |
BMI
Real number (ℝ)
MISSING 
| Distinct | 3985 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 48806 |
| Missing (%) | 11.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28.529842 |
| Minimum | 12.02 |
|---|---|
| Maximum | 99.64 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.4 MiB |
Quantile statistics
| Minimum | 12.02 |
|---|---|
| 5-th percentile | 20.12 |
| Q1 | 24.13 |
| median | 27.44 |
| Q3 | 31.75 |
| 95-th percentile | 40.69 |
| Maximum | 99.64 |
| Range | 87.62 |
| Interquartile range (IQR) | 7.62 |
Descriptive statistics
| Standard deviation | 6.5548887 |
|---|---|
| Coefficient of variation (CV) | 0.22975552 |
| Kurtosis | 4.4283868 |
| Mean | 28.529842 |
| Median Absolute Deviation (MAD) | 3.73 |
| Skewness | 1.3877393 |
| Sum | 11307118 |
| Variance | 42.966565 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 26.63 | 4262 | 1.0% |
| 27.46 | 3277 | 0.7% |
| 24.41 | 3188 | 0.7% |
| 27.44 | 3128 | 0.7% |
| 27.12 | 3123 | 0.7% |
| 25.1 | 2726 | 0.6% |
| 32.28 | 2417 | 0.5% |
| 29.53 | 2334 | 0.5% |
| 25.84 | 2331 | 0.5% |
| 29.29 | 2308 | 0.5% |
| Other values (3975) | 367232 | 82.5% |
| (Missing) | 48806 | 11.0% |
| Value | Count | Frequency (%) |
| 12.02 | 1 | < 0.1% |
| 12.05 | 1 | < 0.1% |
| 12.06 | 1 | < 0.1% |
| 12.11 | 3 | < 0.1% |
| 12.15 | 1 | < 0.1% |
| 12.16 | 5 | < 0.1% |
| 12.19 | 1 | < 0.1% |
| 12.2 | 1 | < 0.1% |
| 12.21 | 3 | < 0.1% |
| 12.24 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 99.64 | 1 | < 0.1% |
| 99.34 | 1 | < 0.1% |
| 97.65 | 5 | < 0.1% |
| 97.43 | 1 | < 0.1% |
| 96.2 | 1 | < 0.1% |
| 95.66 | 2 | < 0.1% |
| 94.66 | 1 | < 0.1% |
| 93.88 | 2 | < 0.1% |
| 93.51 | 1 | < 0.1% |
| 93.41 | 1 | < 0.1% |
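The report does not say how the BMI column was derived. A plausible reading, stated here as an assumption, is the standard definition: weight in kilograms divided by the square of height in metres. The sketch below checks that definition against a few rows from the sample table at the end of this report; the tolerance of 0.2 is an arbitrary choice for illustration.

```python
# (HeightInMeters, WeightInKilograms, BMI) taken from sample rows 1, 2, 3 and 5.
sample_rows = [
    (1.60, 68.04, 26.57),
    (1.57, 63.50, 25.61),
    (1.65, 63.50, 23.30),
    (1.80, 84.82, 26.08),
]


def bmi(height_m, weight_kg):
    """Body mass index under the standard kg / m^2 definition."""
    return weight_kg / height_m ** 2


for height_m, weight_kg, reported in sample_rows:
    computed = bmi(height_m, weight_kg)
    print(f"reported={reported:.2f} computed={computed:.2f} "
          f"difference={computed - reported:+.2f}")
```

The recomputed values land close to, but not exactly on, the reported ones. One possible explanation is that the stored BMI was calculated before height and weight were rounded to two decimals, so exact equality should not be expected when validating this column.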
AlcoholDrinkers
Boolean
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 46574 |
| Missing (%) | 10.5% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| True | 210891 | 47.4% |
| False | 187667 | 42.2% |
| (Missing) | 46574 | 10.5% |
HIVTesting
Boolean
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 66127 |
| Missing (%) | 14.9% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 249919 | 56.1% |
| True | 129086 | 29.0% |
| (Missing) | 66127 | 14.9% |
FluVaxLast12
Boolean
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 47121 |
| Missing (%) | 10.6% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| True | 209256 | 47.0% |
| False | 188755 | 42.4% |
| (Missing) | 47121 | 10.6% |
PneumoVaxEver
Boolean
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 77040 |
| Missing (%) | 17.3% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 215604 | 48.4% |
| True | 152488 | 34.3% |
| (Missing) | 77040 | 17.3% |
TetanusLast10Tdap
Categorical
MISSING 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 82516 |
| Missing (%) | 18.5% |
| Memory size | 3.4 MiB |
Length
| Max length | 57 |
|---|---|
| Median length | 49 |
| Mean length | 42.454828 |
| Min length | 18 |
Characters and Unicode
| Total characters | 15394800 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Yes, received tetanus shot but not sure what type |
|---|---|
| 2nd row | No, did not receive any tetanus shot in the past 10 years |
| 3rd row | No, did not receive any tetanus shot in the past 10 years |
| 4th row | No, did not receive any tetanus shot in the past 10 years |
| 5th row | No, did not receive any tetanus shot in the past 10 years |
Common Values
| Value | Count | Frequency (%) |
| No, did not receive any tetanus shot in the past 10 years | 121493 | 27.3% |
| Yes, received tetanus shot but not sure what type | 113725 | 25.5% |
| Yes, received Tdap | 99943 | 22.5% |
| Yes, received tetanus shot, but not Tdap | 27455 | 6.2% |
| (Missing) | 82516 | 18.5% |
Words
| Value | Count | Frequency (%) |
| not | 262673 | 8.8% |
| tetanus | 262673 | 8.8% |
| shot | 262673 | 8.8% |
| received | 241123 | 8.1% |
| yes | 241123 | 8.1% |
| but | 141180 | 4.7% |
| tdap | 127398 | 4.3% |
| 10 | 121493 | 4.1% |
| years | 121493 | 4.1% |
| no | 121493 | 4.1% |
| Other values (9) | 1070133 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2610839 | 17.0% |
| e | 2062080 | |
| t | 1662308 | |
| s | 1123180 | 7.3% |
| a | 868275 | 5.6% |
| n | 768332 | 5.0% |
| o | 646839 | 4.2% |
| d | 611507 | 4.0% |
| i | 605602 | 3.9% |
| r | 597834 | 3.9% |
| Other values (14) | 3838004 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11660890 | |
| Space Separator | 2610839 | 17.0% |
| Uppercase Letter | 490014 | 3.2% |
| Other Punctuation | 390071 | 2.5% |
| Decimal Number | 242986 | 1.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2062080 | |
| t | 1662308 | |
| s | 1123180 | |
| a | 868275 | 7.4% |
| n | 768332 | 6.6% |
| o | 646839 | 5.5% |
| d | 611507 | 5.2% |
| i | 605602 | 5.2% |
| r | 597834 | 5.1% |
| u | 517578 | 4.4% |
| Other values (7) | 2197355 |
Uppercase Letter
| Value | Count | Frequency (%) |
| Y | 241123 | |
| T | 127398 | |
| N | 121493 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 121493 | |
| 1 | 121493 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2610839 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 390071 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12150904 | |
| Common | 3243896 | 21.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2062080 | |
| t | 1662308 | |
| s | 1123180 | |
| a | 868275 | 7.1% |
| n | 768332 | 6.3% |
| o | 646839 | 5.3% |
| d | 611507 | 5.0% |
| i | 605602 | 5.0% |
| r | 597834 | 4.9% |
| u | 517578 | 4.3% |
| Other values (10) | 2687369 |
Common
| Value | Count | Frequency (%) |
| (space) | 2610839 | 80.5% |
| , | 390071 | 12.0% |
| 0 | 121493 | 3.7% |
| 1 | 121493 | 3.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15394800 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2610839 | 17.0% |
| e | 2062080 | |
| t | 1662308 | |
| s | 1123180 | 7.3% |
| a | 868275 | 5.6% |
| n | 768332 | 5.0% |
| o | 646839 | 4.2% |
| d | 611507 | 4.0% |
| i | 605602 | 3.9% |
| r | 597834 | 3.9% |
| Other values (14) | 3838004 |
HighRiskLastYear
Boolean
IMBALANCE  MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 50623 |
| Missing (%) | 11.4% |
| Memory size | 869.5 KiB |
| Value | Count | Frequency (%) |
| False | 377324 | 84.8% |
| True | 17185 | 3.9% |
| (Missing) | 50623 | 11.4% |
CovidPos
Categorical
MISSING 
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 50764 |
| Missing (%) | 11.4% |
| Memory size | 3.4 MiB |
Length
| Max length | 61 |
|---|---|
| Median length | 2 |
| Mean length | 4.2912635 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1692337 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | No |
|---|---|
| 2nd row | No |
| 3rd row | Yes |
| 4th row | No |
| 5th row | No |
Common Values
| Value | Count | Frequency (%) |
| No | 270055 | 60.7% |
| Yes | 110877 | 24.9% |
| Tested positive using home test without a health professional | 13436 | 3.0% |
| (Missing) | 50764 | 11.4% |
Words
| Value | Count | Frequency (%) |
| no | 270055 | |
| yes | 110877 | |
| tested | 13436 | 2.7% |
| positive | 13436 | 2.7% |
| using | 13436 | 2.7% |
| home | 13436 | 2.7% |
| test | 13436 | 2.7% |
| without | 13436 | 2.7% |
| a | 13436 | 2.7% |
| health | 13436 | 2.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 337235 | |
| N | 270055 | |
| e | 204929 | |
| s | 191493 | |
| Y | 110877 | 6.6% |
| (space) | 107488 | 6.4% |
| t | 94052 | 5.6% |
| i | 67180 | 4.0% |
| h | 53744 | 3.2% |
| a | 40308 | 2.4% |
| Other values (12) | 214976 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1190481 | |
| Uppercase Letter | 394368 | 23.3% |
| Space Separator | 107488 | 6.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 337235 | |
| e | 204929 | |
| s | 191493 | |
| t | 94052 | 7.9% |
| i | 67180 | 5.6% |
| h | 53744 | 4.5% |
| a | 40308 | 3.4% |
| l | 26872 | 2.3% |
| p | 26872 | 2.3% |
| u | 26872 | 2.3% |
| Other values (8) | 120924 | 10.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 270055 | |
| Y | 110877 | |
| T | 13436 | 3.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 107488 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1584849 | |
| Common | 107488 | 6.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 337235 | |
| N | 270055 | |
| e | 204929 | |
| s | 191493 | |
| Y | 110877 | 7.0% |
| t | 94052 | 5.9% |
| i | 67180 | 4.2% |
| h | 53744 | 3.4% |
| a | 40308 | 2.5% |
| l | 26872 | 1.7% |
| Other values (11) | 188104 |
Common
| Value | Count | Frequency (%) |
| (space) | 107488 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1692337 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 337235 | |
| N | 270055 | |
| e | 204929 | |
| s | 191493 | |
| Y | 110877 | 6.6% |
| (space) | 107488 | 6.4% |
| t | 94052 | 5.6% |
| i | 67180 | 4.0% |
| h | 53744 | 3.2% |
| a | 40308 | 2.4% |
| Other values (12) | 214976 |
| | State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Alabama | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 8.0 | NaN | No | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No | No | Never smoked | Not at all (right now) | No | White only, Non-Hispanic | Age 80 or older | NaN | NaN | NaN | No | No | Yes | No | Yes, received tetanus shot but not sure what type | No | No |
| 1 | Alabama | Female | Excellent | 0.0 | 0.0 | NaN | No | 6.0 | NaN | No | No | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 80 or older | 1.60 | 68.04 | 26.57 | No | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No |
| 2 | Alabama | Female | Very good | 2.0 | 3.0 | Within past year (anytime less than 12 months ago) | Yes | 5.0 | NaN | No | No | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 55 to 59 | 1.57 | 63.50 | 25.61 | No | No | No | No | NaN | No | Yes |
| 3 | Alabama | Female | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | NaN | No | No | No | Yes | No | No | No | No | Yes | No | No | No | No | No | No | No | Current smoker - now smokes some days | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | NaN | 1.65 | 63.50 | 23.30 | No | No | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | No |
| 4 | Alabama | Female | Fair | 2.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 9.0 | NaN | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 40 to 44 | 1.57 | 53.98 | 21.77 | Yes | No | No | Yes | No, did not receive any tetanus shot in the past 10 years | No | No |
| 5 | Alabama | Male | Poor | 1.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 7.0 | NaN | Yes | No | Yes | No | No | No | No | No | No | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 80 or older | 1.80 | 84.82 | 26.08 | No | No | No | Yes | No, did not receive any tetanus shot in the past 10 years | No | No |
| 6 | Alabama | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | NaN | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | No | Black only, Non-Hispanic | Age 80 or older | 1.65 | 62.60 | 22.96 | Yes | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No |
| 7 | Alabama | Female | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 8.0 | NaN | No | No | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 80 or older | 1.63 | 73.48 | 27.81 | No | No | Yes | Yes | Yes, received tetanus shot but not sure what type | No | No |
| 8 | Alabama | Female | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 6.0 | NaN | No | No | No | No | Yes | No | No | No | Yes | No | No | Yes | No | Yes | No | No | Former smoker | Not at all (right now) | NaN | White only, Non-Hispanic | Age 75 to 79 | 1.70 | NaN | NaN | No | Yes | No | No | Yes, received tetanus shot but not sure what type | No | No |
| 9 | Alabama | Female | Good | 1.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | NaN | No | No | No | No | No | No | No | Yes | No | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | NaN | White only, Non-Hispanic | Age 70 to 74 | 1.68 | 81.65 | 29.05 | Yes | NaN | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | No |
| | State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 445122 | Virgin Islands | Male | Fair | 30.0 | 1.0 | Within past year (anytime less than 12 months ago) | No | 6.0 | 6 or more, but not all | No | NaN | Yes | No | No | Yes | No | No | No | No, pre-diabetes or borderline diabetes | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | Yes | White only, Non-Hispanic | Age 70 to 74 | 1.78 | 70.31 | 22.24 | No | No | Yes | NaN | Yes, received tetanus shot but not sure what type | No | Yes |
| 445123 | Virgin Islands | Female | Fair | 0.0 | 7.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | Yes | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | Black only, Non-Hispanic | Age 25 to 29 | 1.93 | 90.72 | 24.34 | No | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | Yes |
| 445124 | Virgin Islands | Male | Good | 0.0 | 15.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | 1 to 5 | No | No | Yes | No | No | No | No | No | Yes | Yes | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | Multiracial, Non-Hispanic | Age 65 to 69 | 1.68 | 83.91 | 29.86 | Yes | Yes | Yes | Yes | Yes, received tetanus shot but not sure what type | No | Yes |
| 445125 | Virgin Islands | Male | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 5.0 | NaN | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Former smoker | Never used e-cigarettes in my entire life | NaN | Black only, Non-Hispanic | Age 65 to 69 | 1.68 | 74.84 | 26.63 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 445126 | Virgin Islands | Male | Good | 0.0 | 0.0 | Within past 2 years (1 year but less than 2 years ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | White only, Non-Hispanic | Age 30 to 34 | 1.83 | 104.33 | 31.19 | Yes | NaN | No | No | NaN | No | Yes |
| 445127 | Virgin Islands | Female | Good | 0.0 | 3.0 | Within past 2 years (1 year but less than 2 years ago) | Yes | 6.0 | None of them | No | No | No | Yes | No | No | Yes | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | Black only, Non-Hispanic | Age 18 to 24 | 1.65 | 69.85 | 25.63 | NaN | Yes | No | No | No, did not receive any tetanus shot in the past 10 years | No | Yes |
| 445128 | Virgin Islands | Female | Excellent | 2.0 | 2.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | No | Black only, Non-Hispanic | Age 50 to 54 | 1.70 | 83.01 | 28.66 | No | Yes | Yes | No | Yes, received tetanus shot but not sure what type | No | No |
| 445129 | Virgin Islands | Female | Poor | 30.0 | 30.0 | 5 or more years ago | No | 5.0 | 1 to 5 | No | No | No | No | No | No | No | No | No | No | No | No | NaN | No | No | No | Current smoker - now smokes every day | Use them some days | NaN | NaN | Age 65 to 69 | 1.70 | 49.90 | 17.23 | NaN | No | No | No | No, did not receive any tetanus shot in the past 10 years | No | No |
| 445130 | Virgin Islands | Male | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | No | 5.0 | None of them | Yes | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | No | Never smoked | Never used e-cigarettes in my entire life | Yes | Black only, Non-Hispanic | Age 70 to 74 | 1.83 | 108.86 | 32.55 | No | Yes | Yes | Yes | No, did not receive any tetanus shot in the past 10 years | No | Yes |
| 445131 | Virgin Islands | Male | Very good | 0.0 | 1.0 | NaN | Yes | 5.0 | None of them | No | No | No | No | No | No | No | No | No | No | No | No | Yes | Yes | No | No | Former smoker | Not at all (right now) | Yes | Black only, Non-Hispanic | Age 40 to 44 | 1.68 | 63.50 | 22.60 | Yes | No | No | No | Yes, received tetanus shot but not sure what type | No | No |
Most frequently occurring
| | State | Sex | GeneralHealth | PhysicalHealthDays | MentalHealthDays | LastCheckupTime | PhysicalActivities | SleepHours | RemovedTeeth | HadHeartAttack | HadAngina | HadStroke | HadAsthma | HadSkinCancer | HadCOPD | HadDepressiveDisorder | HadKidneyDisease | HadArthritis | HadDiabetes | DeafOrHardOfHearing | BlindOrVisionDifficulty | DifficultyConcentrating | DifficultyWalking | DifficultyDressingBathing | DifficultyErrands | SmokerStatus | ECigaretteUsage | ChestScan | RaceEthnicityCategory | AgeCategory | HeightInMeters | WeightInKilograms | BMI | AlcoholDrinkers | HIVTesting | FluVaxLast12 | PneumoVaxEver | TetanusLast10Tdap | HighRiskLastYear | CovidPos | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | Connecticut | Male | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 55 to 59 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4 |
| 45 | Louisiana | Female | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4 |
| 7 | Colorado | Male | Good | 0.0 | 0.0 | NaN | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Hispanic | Age 18 to 24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 24 | Florida | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 65 to 69 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 62 | Minnesota | Female | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 60 to 64 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 89 | New York | Male | Excellent | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 50 to 54 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 92 | New York | Male | Good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Hispanic | Age 40 to 44 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 100 | Ohio | Male | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 7.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 45 to 49 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 104 | South Carolina | Male | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | 1 to 5 | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | White only, Non-Hispanic | Age 75 to 79 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 114 | Texas | Male | Very good | 0.0 | 0.0 | Within past year (anytime less than 12 months ago) | Yes | 8.0 | None of them | No | No | No | No | No | No | No | No | No | No | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Hispanic | Age 18 to 24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
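
A table like the one above could be approximated in pandas with the sketch below. It is not necessarily how this report was generated: the helper name `most_frequent_duplicates`, the `top` parameter, and the toy data are all assumptions made for illustration. Missing values are treated as equal to each other, so rows that are mostly NaN (such as the Louisiana row) can still count as duplicates.

```python
import numpy as np
import pandas as pd


def most_frequent_duplicates(df: pd.DataFrame, top: int = 10) -> pd.DataFrame:
    """Hypothetical helper: count fully identical rows, most frequent first."""
    # keep=False flags every member of a duplicated group, not only the repeats.
    dupes = df[df.duplicated(keep=False)]
    counts = (
        # dropna=False keeps groups whose key columns contain NaN;
        # observed=True avoids expanding unused categorical combinations.
        dupes.groupby(list(df.columns), dropna=False, observed=True)
        .size()
        .reset_index(name="# duplicates")
    )
    return counts.sort_values("# duplicates", ascending=False).head(top)


# Toy data (made up for illustration, not taken from the dataset).
toy = pd.DataFrame(
    {
        "State": ["Ohio", "Ohio", "Ohio", "Texas", "Texas", "Iowa"],
        "Sex": ["Male", "Male", "Male", "Female", "Female", "Male"],
        "BMI": [np.nan, np.nan, np.nan, 22.5, 22.5, 30.1],
    }
)
result = most_frequent_duplicates(toy)
print(result)
```

On the full dataset the same call would take the loaded DataFrame in place of `toy`. Whether to drop such rows before modelling is a separate decision, since identical survey answers from different respondents are plausible.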